System for adaptive real-time facial recognition using fixed video and still cameras

ABSTRACT

A system for facial recognition, consisting of a global database of facial characteristics of known users, a server which accepts registration pictures taken on a user's camera device and accepts and processes training pictures and videos of other users to enable algorithm improvement, and a static camera which processes user pictures to determine whether the user is a known user or not based on the global database and a local database associated with each static camera.

FIELD OF THE INVENTION

The present invention generally relates to the recognition of persons based on their facial characteristics.

Systems for processing persons based on their identity have several common characteristics. They include a centralized database of information about the person or persons who can be processed, and one or more cameras. For entry systems, these cameras will be in fixed locations, with different lighting between cameras and depending on the time of day, weather conditions, etc.

There are also ways to fool these systems. One example of that is to hold a picture of the person up to the camera. Another is to have a picture of the person on an article of clothing, such as a t-shirt.

What is needed is a system for determining the identity of persons which can rapidly identify persons in varying lighting conditions, while overcoming some possible ways of getting around the system.

SUMMARY

A system for facial recognition, consisting of a global database of facial characteristics of known users, a server which accepts registration pictures taken on a user's camera device and accepts and processes training pictures and videos of other users to enable algorithm improvement, and a static camera which processes user pictures to determine whether the user is a known user or not based on the global database and a local database associated with each static camera.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows an embodiment of the physical system including that necessary to enroll, train, and detect a known person.

FIG. 2 shows the details of one embodiment of the components of the system necessary to enroll, train and detect a known person.

FIG. 3 shows one or more embodiments of the details of the process needed to detect a known person.

FIG. 4 shows one or more embodiments of the details needed to train the system.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

FIG. 1 shows one embodiment of the system. A computer 104 is configured to accept pictures from one or more enrollment camera devices 102, such as cell phones, tablets or computers. The computer will determine facial characteristics of a known person in the picture, and determine the quality of the information. In one or more embodiments, the computer 104 will interact with the user via the enrollment camera device 102, prompting the user for identification information and verifying that the pictures contain adequate information by making sure that only one person is visible in the picture, and that there are adequate views of the person. In one or more embodiments, an action module 114 is called by the computer 104 based on the identification of the known person and location of the remote camera 110. For instance, if the remote camera 110 is over a door and the person identified is known by the system to have access to that door, one embodiment of the action module 114 will be to trigger a request to unlock that door. In one or more embodiments, the action module 114 is associated with a database which defines the attributes of the action. For instance, a person could only have access to certain doors, and then only during certain time periods.

A known person is a person of interest to the system, such as a person who needs access to a facility, or is a member of a group of interest, where the system will notify one or more action modules to react. An other person is a person not of interest but known to the system, whose facial characteristics can be compared to different persons (other or known) to improve the identification rules. In one such embodiment, this could be implemented using a defined set of rules. For example, a rule may say that if a person appears at a door during a specific time period, then the door is unlocked; otherwise, the appearance is logged and ignored.
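
The door rule described above can be sketched as follows. This is only an illustrative sketch, not the claimed implementation: the names `AccessRule`, `decide`, and `EXAMPLE_RULES`, and the sample person, door, and time values, are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from datetime import time


@dataclass(frozen=True)
class AccessRule:
    """Hypothetical rule: one person may open one door in one time window."""
    person_id: str
    door_id: str
    start: time
    end: time


def decide(rules, person_id, door_id, now):
    """Return "unlock" if any rule matches, otherwise "log"."""
    for rule in rules:
        if (rule.person_id == person_id
                and rule.door_id == door_id
                and rule.start <= now <= rule.end):
            return "unlock"
    # No matching rule: the appearance is logged and otherwise ignored.
    return "log"


# Illustrative data only.
EXAMPLE_RULES = [AccessRule("alice", "front_door", time(8, 0), time(18, 0))]

if __name__ == "__main__":
    print(decide(EXAMPLE_RULES, "alice", "front_door", time(9, 30)))
    print(decide(EXAMPLE_RULES, "alice", "front_door", time(22, 0)))
```

A rule table of this kind keeps the access policy in data rather than in the action module itself, which is consistent with the database of action attributes described for the action module 114.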

FIG. 2 shows one embodiment of the system with its software modules. The enrollment manager 202 accepts picture data and known person information from the enrollment camera 204. In one or more embodiments, the enrollment manager 202 determines the quality of the picture data on several levels. First, is there a single identifiable face in the pictures, or is it difficult to tell whether there is more than one person in the picture? Second, is the picture quality such that facial characteristics are difficult to recognize? Third, are there multiple useful views of the face (direct, left profile, right profile) such that adequately detailed facial characteristic data can be extracted from the information? In one or more embodiments, the features detected in different views are weighted by importance so that the values associated with face-forward views get more weight than values associated with profile views, but the profile views still help to improve the calculations.
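
The view weighting described above might be sketched as a weighted average of one feature value measured in each available view. The weights in `VIEW_WEIGHTS` and the function name `combine_views` are assumptions chosen for illustration; the specification states only that face-forward values receive more weight than profile values.

```python
# Assumed weights: the face-forward view counts more than either profile.
VIEW_WEIGHTS = {"frontal": 0.6, "left_profile": 0.2, "right_profile": 0.2}


def combine_views(measurements, weights=VIEW_WEIGHTS):
    """Weighted average of one feature value across the views that exist.

    measurements maps a view name to the feature value seen in that view.
    Views without a measurement are skipped and the weights renormalized.
    """
    used = {view: w for view, w in weights.items() if view in measurements}
    if not used:
        raise ValueError("no usable views supplied")
    total_weight = sum(used.values())
    weighted_sum = sum(measurements[view] * w for view, w in used.items())
    return weighted_sum / total_weight


if __name__ == "__main__":
    sample = {"frontal": 0.70, "left_profile": 0.60, "right_profile": 0.60}
    print(combine_views(sample))
```

Renormalizing over the views that are present lets a profile view still contribute when it exists, without penalizing an enrollment that has only a face-forward picture.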

In one or more embodiments, the software used to define the rules for finding known persons from the facial features uses an algorithm based on a convolutional neural network to discover the relationships between facial features and known persons. A convolutional neural network is a type of neural network inspired by how the animal visual cortex works. Such networks have a wide range of applications in image and video processing (see Wikipedia, https://en.wikipedia.org/wiki/Convolutional_neural_network).

If adequate information can be extracted by the enrollment manager 202, then the facial characteristics and known person information are saved in the universal database 208. The universal database 208 contains location independent information about a known person. In one or more embodiments, the universal rules module 216 compares the information of multiple known persons to create a set of rules to differentiate them. For instance, if there were only two known persons, and one had a ratio of distance between the eyes to distance between the ears of 0.7 and the other had a ratio of 0.6, one rule might be that if the ratio is greater than 0.65 it is the first known person and if it is less than 0.65 it is the other. In one or more embodiments, the ratios and values will vary based on the angle of view and therefore have a mean and variance, so that one has a probability of whether it is one known person or another.
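
The two-person example above can be made probabilistic with a minimal sketch. It assumes, purely for illustration, that each person's ratio follows a normal distribution, that both persons are equally likely beforehand, and that the standard deviation is 0.03; none of these choices comes from the specification, and the names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior(ratio, persons):
    """Probability of each known person given one measured ratio.

    persons maps a name to (mean, std) of that person's eye/ear ratio.
    Equal prior probabilities are assumed.
    """
    likelihoods = {
        name: gaussian_pdf(ratio, mean, std)
        for name, (mean, std) in persons.items()
    }
    total = sum(likelihoods.values())
    return {name: value / total for name, value in likelihoods.items()}


# Means follow the example in the text; the std of 0.03 is an assumption.
EXAMPLE_PERSONS = {"person_a": (0.7, 0.03), "person_b": (0.6, 0.03)}

if __name__ == "__main__":
    print(posterior(0.68, EXAMPLE_PERSONS))
```

With equal variances, the point where both probabilities are equal falls at 0.65, which matches the hard threshold in the example rule.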

At various fixed locations, the system interacts with one or more deployment cameras 214. Each deployment camera 214 is pointed at a background that will vary during the day in terms of light and objects. For instance, one location may be the entrance of a building that gets natural light during the day and one or more floodlights at night. Another may be in a hallway where different carts, pictures or other objects are stored against the opposite wall. In one or more embodiments, the placement of the deployment camera 214 is such that the expected size and distance of a person can be regulated. For instance, by placing the deployment camera above the height of an average person, then directing the person to stand some specified distance from the camera, the size of the face will not vary from a tall to a short person very much.

Associated with each deployment camera 214 is a deployment camera manager module 210. The deployment camera manager module 210 accepts picture data from the deployment camera 214 and removes the background information. This is done by recording the background information on a periodic basis so that it can be removed from the object that has inserted itself into the foreground. In one or more embodiments, the deployment camera manager module 210 adjusts the identification algorithm for changes in lighting. For instance, features may appear thicker in darker lighting and thinner in lighter conditions. The deployment camera manager module 210 identifies the face from the facial characteristics, then aligns the face to get a standard size face to work with.
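
One simple way to realize the periodic background recording described above is a running-average background model with per-pixel differencing. The sketch below is an assumption about how this could be done, not the method of the specification; frames are plain nested lists of grayscale values, and `alpha` and `threshold` are hypothetical tuning parameters.

```python
def update_background(background, frame, alpha=0.05):
    """Blend a new empty-scene frame into the stored background model."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


def foreground_mask(background, frame, threshold=25):
    """Mark pixels that differ from the background by more than threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]


if __name__ == "__main__":
    background = [[10, 10], [10, 10]]
    frame = [[10, 200], [12, 10]]
    # Only the pixel that changed strongly is treated as foreground.
    print(foreground_mask(background, frame))
```

Updating the model slowly (small `alpha`) lets it follow gradual lighting changes over the day while a person who briefly steps into view still stands out as foreground.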

Using the universal rules module 216, identification of a known person is attempted. In one or more embodiments, a fixed probability threshold is used to determine whether or not this is a known person. If that threshold is not exceeded, more pictures are accepted until either some lower threshold is reached or some time limit is exceeded. In one or more embodiments, this lower threshold will be a function of the number of acceptable quality pictures. For instance, there could be a fixed threshold of 95%, a threshold after 5 pictures or less of 90%, after 10 pictures of 90% and a timeout of 5 seconds.
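
The threshold logic above could be sketched as the loop below. Several details are assumptions made only for illustration: the lower threshold is applied to the mean score of the pictures seen so far, it applies only once at least `min_pictures` have been collected, and the function `lower` returns 0.90 for any count, following the example values. The names are hypothetical.

```python
def identify(observations, fixed=0.95, lower=lambda n: 0.90,
             min_pictures=2, timeout_s=5.0):
    """Decide whether a sequence of pictures identifies a known person.

    observations is an iterable of (elapsed_seconds, match_probability)
    pairs, one per acceptable quality picture, in arrival order.
    """
    scores = []
    for elapsed_s, score in observations:
        if elapsed_s > timeout_s:
            break  # time limit exceeded: stop accepting pictures
        if score >= fixed:
            return True  # a single picture met the fixed threshold
        scores.append(score)
        count = len(scores)
        # Assumed rule: the relaxed threshold applies to the mean score.
        if count >= min_pictures and sum(scores) / count >= lower(count):
            return True
    return False


if __name__ == "__main__":
    print(identify([(0.2, 0.92), (0.6, 0.91)]))
    print(identify([(0.2, 0.80), (6.0, 0.99)]))
```

Passing `lower` as a function of the picture count leaves room for a schedule that relaxes further as more pictures arrive, which the text allows but does not fix.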

In one or more embodiments, a per camera adapted database 212 is maintained for each deployment camera 214. In one or more embodiments, the per camera adapted database 212 is maintained over some period of time or some number of known users. Based on the picture quality and the facial data, this can be used to update rules stored in both the universal database 208 and the per camera adapted database 212.

In one or more embodiments, there are two ways to update the universal database from the per camera adapted database. First, calculate an image quality value for each picture based on contrast and angle of view (profile vs. face forward). Add high quality images to the universal database and associate them with the known person, effectively re-enrolling him.

Second, one can examine a group of persons from a specific deployment camera and, using convolutional neural networks or similar techniques, learn a mapping from the facial features detected by a specific deployment camera to the known person.

To prevent a user from being accepted by displaying a picture in front of a camera, there are several processing steps that can be used. In one or more embodiments, the deployment camera manager module 210 can look for multiple views of the user, either by examining multiple input frames or requesting that the user turn his head. In other embodiments, the head size can be assumed to be within a certain range such that if it is smaller or larger, it is assumed to be invalid (i.e. trying to hold a picture up to the camera).

In one or more embodiments, the deployment camera manager module interacts with a display that shows a short phrase for the user to read, and examines the movement of the lips using a lip reading module associated with the deployment camera manager module. In one or more embodiments, the lip reading module tracks the location of landmarks on the lower and upper lip using standard motion based tracking. If the lip motion appears to be in line with utterances of that phrase within a determined probability score, then the user is assumed to be a real person.

FIG. 3 shows one or more embodiments of the process of accepting new picture data. When the picture data is received, a background filter 302 is used to remove the background information from the person image. The person image is scanned to attempt feature recognition 304. In one or more embodiments, the features are detected and aligned, and the universal database 106 and the per camera adapted database 108 are used to detect what known person(s) it might be. In one or more embodiments, the alignment is done using convolutional neural network techniques to align and center a face. The rules associated with known persons can then be applied to calculate a match score. Based on a calculated match score 306, the result is to reject it or execute an action 308. In one or more embodiments, the match score is calculated by computing the distance between the aligned features of an image from the new picture data and those stored in the universal database or per camera adapted database. The distance can be computed using either the dot product or Euclidean distance between features. Finally, the database is updated through a process called re-enroll 310, where the new information is compared against a random subset of other known persons to improve the matching algorithms.
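
The two distance measures named above can be sketched on plain feature vectors. This is a minimal illustration under assumptions: the feature vectors, the gallery contents, and the helper `best_match` are hypothetical, and choosing the nearest stored vector by Euclidean distance is just one possible way to turn a distance into a match.

```python
import math


def dot_product(a, b):
    """Dot product of two equal-length feature vectors (larger = closer)."""
    return sum(x * y for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Euclidean distance between two feature vectors (smaller = closer)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_match(probe, gallery):
    """Return the name in gallery whose stored features are nearest probe."""
    return min(gallery, key=lambda name: euclidean_distance(probe, gallery[name]))


if __name__ == "__main__":
    # Illustrative stored features for two known persons.
    gallery = {"alice": [0.70, 0.20], "bob": [0.60, 0.90]}
    probe = [0.68, 0.25]
    print(best_match(probe, gallery))
```

In practice the nearest distance would still be compared against a threshold before an action is executed, so that an unknown face is rejected rather than assigned to the closest known person.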

In one or more embodiments, the rules are further improved using training data from various sources, such as a set of photos of a known person, TED™ talks or YouTube™ videos. The training manager module 206 accepts a video or picture input. A user interacts with the training manager to identify the person of interest and provide some identifying information for him. In one or more embodiments, the training manager module will track that person through the video or set of pictures, extracting in some cases hundreds of usable pictures of that person and identifying facial features. The data so extracted can be used against the universal database to further refine rules for finding known persons.

FIG. 4 shows one embodiment of the process of using training data. A video is selected by a user and presented to the training manager 402. The training manager module 206 interacts with the user to select a face of interest 404, after which the training manager module 206 can track the person as he or she moves through the video and acquire as many usable images of the face as possible. These images are then processed to recognize features 406, and the resulting data is compared to random sets of users to improve the rules 408.

What is claimed is:

1) A system for facial recognition, the system comprising: a universal database, consisting of the facial characteristics of known persons, a computer, an enrollment manager module, coupled to the computer, configured to accept registration pictures and known person information from an enrollment camera, detect facial characteristics, align the facial characteristics for a standard size face, associate the facial characteristics with the known person information and save those facial characteristics in the universal database, a universal rules database, coupled to the computer, a universal rules module, coupled to the computer, the universal rules database, and the universal database, the universal rules module configured to accept data from the facial characteristics of known persons and calculate rules to enable the differentiation of known persons from one another, and a deployment camera, coupled to the computer, configured to receive picture data.

2) The system in claim 1, further comprising: a per camera adapted database, coupled to the computer and deployment camera manager, configured to save facial recognition data associated with the deployment camera, and a camera rules module, coupled to the per camera adapted database, computer and universal rules database, configured to accept picture data, determine facial characteristics from the picture data, and send the update to the universal rules database.

3) The system in claim 1, further comprising: a database of facial characteristics of other persons, and a training module, coupled to the computer and other person database, configured to accept picture and video data and selection of an other person in the picture and video data, determine the facial characteristics of the other person and save that information in the other person database, where the training module compares one or more other person data to known person data, updates rules and stores the updated rules in the global rules database.

4) The system in claim 1, further comprising: an action module, coupled to the computer, which will perform a function on an external module based on the location of the remote camera and the identity of the known person.

5) A process for recognizing a person from his facial characteristics based on a set of pre-defined rules, the process comprising: accepting picture data from a camera, removing the static background from the picture data, determining the facial characteristics from the picture data, calculating a match score from the pre-defined rules, identifying the person based on the match score, and executing an action based on the location of the camera and the identity of the person.

6) The process in claim 5, calculating a match score further comprising: determining if the match score exceeds a maximum threshold, and accepting more picture data to process.

7) The process in claim 5, executing an action further comprising: updating the pre-defined rules using the picture data.

8) The process in claim 5, executing an action further comprising: associating the picture data with the camera and the person, and defining a set of rules associated with the camera and the person.

9) A process for generating rules for recognizing a person from facial characteristics, the process comprising: accepting first picture data from a first user, accepting second picture data from a second user, detecting first facial characteristics from the first picture data, detecting second facial characteristics from the second picture data, comparing the first facial characteristics and the second facial characteristics, and calculating a set of rules to differentiate the first user from the second user.

10) The process in claim 9, further comprising: accepting third picture data from a first outside user, detecting third facial characteristics from the third picture data, comparing the third facial characteristics to the first facial characteristics and second facial characteristics, and updating the rules for the first user to differentiate the first user from the second user and third user.

11) The process in claim 9, where the first picture data comprises multiple views of the face, including a head on, right profile, and left profile.